Skip to content

Conversation

@szehon-ho
Copy link
Member

@szehon-ho szehon-ho commented Apr 15, 2022

Backport of #4520 to 0.13.x branch, which returns sometimes wrong results for Files metadata table queries with filters, for V2 tables with specific partition spec evolutions involving dropping fields, and IndexOutOfBoundException in other cases where partition spec has changed.

Adapted to old code in the branch.

This works, except some test case fails in querying dropped partition fields in V1 tables because of related but different issue #3411 . So one option is to backport that one first for the test, though fix is orthogonal.

@szehon-ho szehon-ho changed the title [0.13 Backport] Core: Fix filter pushdown for metadata tables with evolved specs (#4520) [0.13] Core: Fix filter pushdown for metadata tables with evolved specs (#4520) Apr 15, 2022
@szehon-ho szehon-ho force-pushed the 13_metadata_predicate_evolving_spec branch 2 times, most recently from 2ab0361 to 7cb2b6e Compare April 18, 2022 22:24
@szehon-ho
Copy link
Member Author

@kbendick @rdblue can this backport go in for 0.13.2?

@kbendick
Copy link
Contributor

Yes, at first glance I do think they both should be backported.

The release manager is @nastra fyi, but I personally think if people are encountering this and it’s an obvious bug, we backport it.

I’d personally also like to see #3411 backported, as I’ve recently encountered issues on partition summaries using the void transform (which could be unrelated to both as I need to take a closer look): #4689

But if it’s ready I don’t see why not. Any ideas on a possible timeline for both? I know that’s highly dependent on things you can’t control, but we’ve admittedly delayed 0.13.2 for quite some time.

TLDR - On first pass I would think to include both. But I do have some concerns around holding up the release too much.

I’ll spend time on a proper pass over this tomorrow :)

@szehon-ho
Copy link
Member Author

szehon-ho commented May 12, 2022

Hi @kbendick yea I requested the backport for #3411 already and @ConeyLiu did this in : #4572 and I merged it.

For this one, I think @rdblue was ok and marked the original issue #4520 for 0.13.2 backport by the tag, so I did the backport pr by convenience if it can be merged.

(by the way the problem with the https://github.com/apache/iceberg/milestone/18 tag is , a lot of github issues are marked 'closed' but aren't really backported).

@nastra nastra added this to the Iceberg 0.13.2 Release milestone May 16, 2022
@nastra nastra requested a review from rdblue May 18, 2022 05:49
Copy link
Contributor

@kbendick kbendick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the combination of #4520, the other PRs that were backported, and this, this seems good.

Given it's been some time and other things have been added, I'd suggest rebasing. I'd also consider porting the tests for the other Spark versions to this PR, just so we can ensure that they work. But barring that (or any disagreement with that), I'm +1 on this. The added Spark 3.2 tests are pretty comprehensive.

Thank you @szehon-ho!

@szehon-ho szehon-ho force-pushed the 13_metadata_predicate_evolving_spec branch from 7cb2b6e to 0ad1a29 Compare May 23, 2022 22:58
@szehon-ho
Copy link
Member Author

Thanks @kbendick for the review, rebased. Re: Spark versions, the main fix is in core so I think the Spark side is agnostic (I added the Spark 3.2 test mainly as an end-to-end).

Anyway it's the same fix as on the master branch to #4520, and I can look at adding some tests for other Spark versions over there.

@rdblue rdblue merged commit 5b3e462 into apache:0.13.x May 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants